Climate change is a prominent topic across the country and the world, and opinion on it often divides along political leanings and other socioeconomic factors. In this report, I investigate sentiment on climate change in different regions of the United States by analyzing news articles from several publications in each region.
The graph below shows the sentiment range for each publication. Lower values represent more negative words and higher values represent more positive words.
The table below lists the term frequency (tf), the inverse document frequency (idf), and their product (tf_idf) for each word in each article from each publication.
The graph below shows the sentiment range for each publication. Lower values represent more negative words and higher values represent more positive words.
The table below lists the term frequency (tf), the inverse document frequency (idf), and their product (tf_idf) for each word in each article from each publication.
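For reference, the three table columns can be written out as follows. This is a sketch that assumes the definitions used by tidytext's bind_tf_idf function, with each article treated as one document and the natural logarithm used for idf. The symbols n, N, t and d are notation introduced here, not taken from the report: n_{t,d} is the count of word t in article d, and N is the number of articles.

```latex
\begin{align*}
\mathrm{tf}(t, d)  &= \frac{n_{t,d}}{\sum_{t'} n_{t',d}} \\
\mathrm{idf}(t)    &= \ln \frac{N}{\left|\{\, d : n_{t,d} > 0 \,\}\right|} \\
\mathrm{tf\_idf}(t, d) &= \mathrm{tf}(t, d) \cdot \mathrm{idf}(t)
\end{align*}
```

Under these definitions, a word that appears in every article has idf = ln(1) = 0, so its tf_idf is 0 no matter how often it occurs.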
I combined the 100 articles used for the Northwest and Midwest regions, drawn from five publications in each region, into a corpus. To do this, I wrote a function that reads the text of a PDF file between the words ‘Body’ and ‘Classification’, since this is the format in which each article was downloaded from the Nexis Uni website. I used lapply to apply this function to each file in the directory for each publication, creating a table for each publication in which each row holds the text of one article. I then ran sentiment analysis on these tables and displayed the AFINN word positivity values for each news publication. To calculate the term frequency and inverse document frequency, I took the per-publication text tables described above, counted the occurrences of every word in each article, calculated the total number of words in each article, and added the tf, idf and tf_idf columns with the bind_tf_idf function.
The analysis shows that sentiment for most of the publications was relatively neutral, with words split roughly evenly between positive and negative values. Additionally, some of the most commonly occurring words in each article were ‘climate’ and ‘change’. Both results are likely artifacts of the method used to find the articles and could be mitigated with a more sophisticated article search procedure.
As next steps, I would recommend a more detailed search for articles, taking care to control for frequently occurring words and to cover as wide a sentiment range as possible. This could allow a more meaningful analysis of climate change sentiment throughout the country.
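The extraction, scoring and weighting steps described above can be sketched in base R. This is a minimal illustration and not the code used for the report. The helper names (extract_body, tokenize), the two sample articles and the toy_lexicon scores are all made up for the example. The real analysis read PDF files, used the AFINN lexicon and called bind_tf_idf, none of which appear here.

```r
# Hypothetical helper: keep only the text between the 'Body' and
# 'Classification' markers of one article's raw text.
extract_body <- function(raw_text) {
  # (?s) lets '.' match newlines; '.*?' stops at the first 'Classification'.
  pattern <- "(?s)Body(.*?)Classification"
  m <- regmatches(raw_text, regexec(pattern, raw_text, perl = TRUE))[[1]]
  if (length(m) < 2) return(NA_character_)
  trimws(m[2])
}

# Hypothetical helper: lower-case a text and split it into words.
tokenize <- function(text) {
  words <- strsplit(tolower(text), "[^a-z']+")[[1]]
  words[nzchar(words)]
}

# Made-up stand-ins for the raw text of two downloaded articles.
articles <- c(
  a1 = "Title one\nBody\nClimate change is a risk.\nClassification\nLanguage: ENGLISH",
  a2 = "Title two\nBody\nClimate policy brings hope and change.\nClassification\nLanguage: ENGLISH"
)
bodies <- vapply(articles, extract_body, character(1))

# One row per (article, word) pair with its count n.
counts <- do.call(rbind, lapply(names(bodies), function(id) {
  tab <- table(tokenize(bodies[[id]]))
  data.frame(article = id, word = names(tab), n = as.integer(tab),
             stringsAsFactors = FALSE)
}))

# Toy lexicon with invented scores; a stand-in for AFINN, not its real values.
toy_lexicon <- c(risk = -2, hope = 2)
counts$score <- unname(toy_lexicon[counts$word])  # NA for unscored words
sentiment <- tapply(counts$n * counts$score, counts$article, sum, na.rm = TRUE)

# tf, idf and tf_idf computed by hand, with each article as one document.
counts$tf <- counts$n / ave(counts$n, counts$article, FUN = sum)
n_articles <- length(unique(counts$article))
docs_with_word <- ave(counts$n, counts$word, FUN = length)
counts$idf <- log(n_articles / docs_with_word)
counts$tf_idf <- counts$tf * counts$idf

print(sentiment)
print(counts[order(-counts$tf_idf), c("article", "word", "n", "tf", "idf", "tf_idf")])
```

In this toy corpus ‘climate’ and ‘change’ occur in both articles, so their idf and tf_idf are 0. That is the same effect that makes search terms shared by every article uninformative under tf_idf weighting.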